Abstract

In medical research, it is common to collect information on multiple continuous biomarkers 
to improve the accuracy of diagnostic tests. Combining the measurements of these biomarkers 
into one single score is a popular practice to integrate the collected information, where the 
accuracy of the resultant diagnostic test is usually improved. To measure the accuracy of 
a diagnostic test, the Youden index has been widely used in literature. Various parametric 
and nonparametric methods have been proposed to linearly combine biomarkers so that the 
corresponding Youden index can be optimized. Yet there seems to be little justification for 
enforcing such a linear combination. This paper proposes a flexible approach that allows 
both linear and nonlinear combinations of biomarkers. The proposed approach formulates 
the problem in a large margin classification framework, where the combination function is 
embedded in a flexible reproducing kernel Hilbert space. Advantages of the proposed approach 
are demonstrated in a variety of simulated experiments as well as a real application to a liver 
disorder study. 
1 Introduction


In medical research, continuous biomarkers have been commonly explored as diagnostic tools to
distinguish subjects, such as diseased and non-diseased groups [1]. The accuracy of a diagnostic
test is usually evaluated through sensitivity and specificity, or the probabilities of true positive
and true negative for any given cut-point. In particular, the receiver operating characteristic
(ROC) curve is defined as sensitivity versus 1 − specificity over all possible cut-points for a given
biomarker [2, 3], which is a comprehensive plot that displays the influence of a biomarker as the
cut-point varies. To summarize the overall information of an ROC curve, different summarizing
indices have been proposed, including the Youden index [4] and the area under the ROC curve
(AUC; [5]).

The Youden index, defined as the maximum vertical distance between the ROC curve and the
45° line, is an indicator of how far the ROC curve is from the uninformative test [3]. Normally, it
ranges from 0 to 1 with 0 for an uninformative test and 1 for an ideal test. The Youden index has
been successfully applied in many clinical studies and has served as an appropriate summary for the
diagnostic accuracy of a single quantitative measurement (e.g., [5, 6, 7]).

It has been widely accepted by medical researchers that diagnosis based on one single biomarker
may not provide sufficient accuracy [8, 9]. Consequently, it is becoming more and more common
that multiple biomarker tests are performed on each individual, and the corresponding measurements
are combined into one single score to help clinicians make better diagnostic judgments. In the
literature, various statistical modeling strategies have been proposed to combine biomarkers in a
linear fashion. For instance, Su and Liu [10] derived the analytical results of the optimal linear
combination based on AUC under the multivariate normal assumption. Pepe and Thompson [11] proposed to
relax the distributional assumption and perform a grid search for the optimal linear combination,
although its computation becomes expensive when the number of biomarkers gets large. Recently,
a number of alternatives were proposed to alleviate the computational burden. For instance, the
min-max approach [12] combines only the minimum and maximum values of biomarker measurements
linearly; the stepwise approach [13] combines all biomarker measurements in a stepwise
manner. By directly targeting the optimal diagnostic accuracy, Yin and Tian [14] extended
these two methods to optimize the Youden index and demonstrated their improved performance in
a number of numerical examples.

In recent years, nonlinear methods have been widely employed to combine multiple biomarkers
in various fields, including genotype classification [15], medical diagnosis [?], and treatment
selection [20]. In this paper, a new model-free approach is proposed and formulated in a large
margin classification framework, where the biomarkers are flexibly combined into one single diagnostic
score so that the corresponding Youden index [4] is maximized. Specifically, the combination
function is modeled non-parametrically in a flexible reproducing kernel Hilbert space (RKHS;
[21]), where both linear and nonlinear combinations could be accommodated via a pre-specified
kernel function.

The rest of the paper is organized as follows. In Section 2, we provide some preliminary background
on combining multiple biomarkers based on the Youden index. In Section 3, we discuss the
motivation for flexible combinations and formulate the proposed flexible approach in a framework
of large margin classification for combining multiple biomarkers. In Section 4, we conduct numerical
experiments to demonstrate the advantages of the proposed approach. In Section 5, we apply
the proposed approach to a liver disorder study. Section 6 contains some discussion.

2 Preliminaries 

Suppose that every subject has $m$ biomarker measurements $X = (X_{(1)}, X_{(2)}, \ldots, X_{(m)})^T$ with a
probability density function $f(X)$, where $X_{(j)}$ is a continuous measurement of the $j$-th biomarker.
Each subject also has a binary response variable $Y \in \{1, -1\}$ indicating whether the subject is diseased or not.
In the literature, researchers from different fields [8, 14] have discussed and explored the validity
of combining $m$ biomarker measurements into one single score function $g(X)$ as a more
powerful diagnostic tool. A subject is diagnosed as diseased if the combined score $g(X)$ is
higher than a given cut-point $c$, and non-diseased otherwise. To summarize its diagnostic accuracy,
the Youden index is commonly used in practice. With sensitivity and specificity defined as
$\mathrm{sen}(g, c) = \Pr(g(X) > c \mid Y = 1)$ and $\mathrm{spe}(g, c) = \Pr(g(X) \le c \mid Y = -1)$ respectively, the
Youden index is formulated as

$$J = \max_{c} \{\mathrm{sen}(g, c) + \mathrm{spe}(g, c) - 1\}.$$

The Youden index normally ranges from 0 to 1, where $J = 1$ corresponds to a perfect separation,
and $J = 0$ corresponds to a random guess.
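
As a minimal illustration of this definition, the sketch below computes the empirical Youden index of a given score by scanning the observed scores as candidate cut-points. The function name `youden_index` and the toy data are hypothetical and are not taken from the study; the rule "diseased if score > c" follows the text.

```python
def youden_index(scores, labels):
    """Return (J, c): the empirical Youden index and a maximizing cut-point.

    A subject is called diseased when score > c; labels are +1 or -1.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # c = -inf calls everyone diseased: sen = 1, spe = 0, hence J = 0.
    best_j, best_c = 0.0, float("-inf")
    # sen and spe only change at observed scores, so these cut-points suffice.
    for c in sorted(set(scores)):
        sen = sum(s > c for s in pos) / len(pos)
        spe = sum(s <= c for s in neg) / len(neg)
        if sen + spe - 1 > best_j:
            best_j, best_c = sen + spe - 1, c
    return best_j, best_c


# Hypothetical toy data: the two groups overlap, so J is below 1.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [-1, -1, 1, 1]
toy_j, toy_c = youden_index(toy_scores, toy_labels)
```

When several cut-points attain the same maximum, this sketch keeps the smallest one; other tie-breaking rules would be equally valid.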

To estimate the Youden index, various modeling strategies have been proposed. Schisterman
et al. [22] provided a closed form for the Youden index assuming the conditional distribution
of $X \mid Y = \pm 1$ follows a multivariate Gaussian distribution. Further relaxing the distributional
assumption, kernel smoothing techniques were adopted by Yin and Tian [14] and Fluss et al. [23],
where the sensitivity and specificity were estimated in a nonparametric fashion.

Note that the formulation of $J$ can be rewritten as

$$
\begin{aligned}
J &= \max_{g,c}\; w(1)\Pr(g(X) > c, Y = 1) + w(-1)\Pr(g(X) \le c, Y = -1) - 1 \\
  &= \max_{g,c}\; \frac{1}{2} E\bigl(w(Y)\,(1 + Y\,\mathrm{sign}(g(X) - c))\bigr) - 1, \qquad (1)
\end{aligned}
$$

where $w(1) = 1/\pi$, $w(-1) = 1/(1 - \pi)$, $\pi = \Pr(Y = 1)$, and $\mathrm{sign}(u) = 1$ if $u > 0$ and $-1$
otherwise. Denote the ideal combination function $g^*(x)$ and cut-point $c^*$ as the ones that maximize
$J$ over all possible functionals and cut-points. Following the proof of Proposition 1 in [15], the
ideal $g^*(x)$ and $c^*$ must satisfy

$$\mathrm{sign}(g^*(x) - c^*) = \mathrm{sign}(p(x) - \pi), \qquad (2)$$

where $p(x) = \Pr(Y = 1 \mid x)$ is the conditional probability of disease given the biomarker
measurements.


3 Linear or nonlinear combination 


In (2), the ideal $g^*(x)$ and $c^*$ are defined based on $p(x)$, which is often unavailable in practice. Hence
the expectation in (1) needs to be estimated based on the given sample $(x_i, y_i)_{i=1}^n$. Specifically, a
natural estimate $\hat J$ can be obtained as

$$
\begin{aligned}
\hat J &= \max_{g,c}\; \frac{1}{2n} \sum_{i=1}^{n} w(y_i)\,(1 + y_i\,\mathrm{sign}(g(x_i) - c)) - 1 \\
       &= \max_{g,c}\; \frac{1}{2|S_1|} \sum_{i \in S_1} (1 + \mathrm{sign}(g(x_i) - c)) + \frac{1}{2|S_{-1}|} \sum_{i \in S_{-1}} (1 - \mathrm{sign}(g(x_i) - c)) - 1, \qquad (3)
\end{aligned}
$$

where $w(1) = 1/\hat\pi = n/|S_1|$, $w(-1) = n/|S_{-1}|$, $S_1 = \{i : y_i = 1\}$, $S_{-1} = \{i : y_i = -1\}$, and $|\cdot|$
denotes the set cardinality.
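
To make the weighting concrete, the following sketch checks numerically that, for a fixed score and cut-point, a weighted-sign expression with weights n/|S_1| and n/|S_{-1}| reproduces the empirical sensitivity plus specificity minus one. The exact normalization (a factor 1/(2n)) is an assumption derived from the definitions of sensitivity and specificity, and the function names and toy data are hypothetical.

```python
def sign(u):
    """sign(u) = 1 if u > 0 and -1 otherwise, as defined in the text."""
    return 1 if u > 0 else -1


def j_hat_weighted(scores, labels, c):
    """(1 / 2n) * sum_i w(y_i) * (1 + y_i * sign(g(x_i) - c)) - 1."""
    n = len(labels)
    n_pos = sum(1 for y in labels if y == 1)
    w = {1: n / n_pos, -1: n / (n - n_pos)}  # w(1) = n/|S_1|, w(-1) = n/|S_-1|
    total = sum(w[y] * (1 + y * sign(s - c)) for s, y in zip(scores, labels))
    return total / (2 * n) - 1


def j_hat_direct(scores, labels, c):
    """Empirical sensitivity + specificity - 1 at cut-point c."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    sen = sum(s > c for s in pos) / len(pos)
    spe = sum(s <= c for s in neg) / len(neg)
    return sen + spe - 1


# Hypothetical scores g(x_i) and labels y_i for seven subjects.
demo_scores = [0.2, 0.9, 0.4, 0.7, 0.1, 0.6, 0.5]
demo_labels = [-1, 1, -1, 1, -1, 1, -1]
```

The two functions agree for any cut-point, including ties at g(x_i) = c, because a tie is counted as "non-diseased" in both.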

The optimization in (3) is generally intractable without a specified candidate space of $g$. In the
literature, the linear functional space $g(x) = \beta^T x$ is often used [10, 11, 12, 13, 14], mainly due to
its convenient implementation and natural interpretation. Yet there seems to be a lack of scientific
support for the use of linear combinations of biomarkers.

Consider a toy example, where $\pi = 1/2$, $X \mid Y = 1 \sim N_2((1, 1)^T, I_2)$ and $X \mid Y = -1 \sim
N_2((0, 0)^T, I_2)$, where $I_2$ is a 2-dimensional identity matrix. Then for any given $x$,

$$p(x) = \frac{f(x \mid Y = 1)}{f(x \mid Y = 1) + f(x \mid Y = -1)} = \frac{1}{1 + e^{1 - (x_{(1)} + x_{(2)})}},$$

where $x = (x_{(1)}, x_{(2)})^T$. Thus, the ideal combination of biomarkers $g^*(x)$ can take the linear
form $g^*(x) = x_{(1)} + x_{(2)}$, leading to $\mathrm{sign}(g^*(x) - c) = \mathrm{sign}(p(x) - 1/2)$ with $c = 1$. However,
if the biomarkers are heteroscedastic in the positive and negative groups, the ideal combination
would no longer be linear. For instance, when $X \mid Y = 1 \sim N_2((1, 1)^T, I_2)$ but $X \mid Y = -1 \sim
N_2((0, 0)^T, 2 I_2)$,

$$p(x) = \frac{f(x \mid Y = 1)}{f(x \mid Y = 1) + f(x \mid Y = -1)} = \frac{2}{2 + e^{1 - (x_{(1)} + x_{(2)}) + (x_{(1)}^2 + x_{(2)}^2)/4}}.$$

Here $p(x) > 1/2$ exactly when $(x_{(1)} + x_{(2)}) - (x_{(1)}^2 + x_{(2)}^2)/4 > 1 - \log(2)$, so the ideal
combination of biomarkers is a quadratic function $g^*(x) = (x_{(1)} + x_{(2)}) - (x_{(1)}^2 + x_{(2)}^2)/4$
with $c = 1 - \log(2)$. Furthermore, if the conditional distribution $X \mid Y$ is unknown, then the ideal
combination of biomarkers may take various forms, and thus a pre-specified assumption of a linear
combination can be too restrictive and lead to suboptimal combinations.
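
The toy example can be checked numerically. The sketch below evaluates the two Gaussian densities directly and compares the rule p(x) > 1/2 with a linear rule (equal covariances) and a quadratic rule (negative group with covariance 2 I_2). The quadratic rule is written as (x1 + x2) - (x1^2 + x2^2)/4 > 1 - log 2, which is what the density ratio gives; all function names and the random test points are hypothetical.

```python
import math
import random


def normal2_pdf(x, mean, var):
    """Density of N_2(mean, var * I_2) at the point x."""
    d2 = (x[0] - mean[0]) ** 2 + (x[1] - mean[1]) ** 2
    return math.exp(-d2 / (2.0 * var)) / (2.0 * math.pi * var)


def posterior(x, var_neg):
    """p(x) = Pr(Y = 1 | x) with pi = 1/2, positives N_2((1,1), I_2),
    negatives N_2((0,0), var_neg * I_2)."""
    f_pos = normal2_pdf(x, (1.0, 1.0), 1.0)
    f_neg = normal2_pdf(x, (0.0, 0.0), var_neg)
    return f_pos / (f_pos + f_neg)


def linear_rule(x):
    """Diseased if x1 + x2 > 1 (equal-covariance case)."""
    return x[0] + x[1] > 1.0


def quadratic_rule(x):
    """Diseased if (x1 + x2) - (x1^2 + x2^2)/4 > 1 - log 2 (heteroscedastic case)."""
    g = (x[0] + x[1]) - (x[0] ** 2 + x[1] ** 2) / 4.0
    return g > 1.0 - math.log(2.0)


rng = random.Random(0)
points = [(rng.uniform(-4, 4), rng.uniform(-4, 4)) for _ in range(2000)]
agree_linear = all((posterior(x, 1.0) > 0.5) == linear_rule(x) for x in points)
agree_quadratic = all((posterior(x, 2.0) > 0.5) == quadratic_rule(x) for x in points)
```

Both flags come out true on the sampled points, which is the sense in which a linear score is ideal in the first setting but not in the second.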


3.1 Model-free estimation formulation 

To allow more flexible $g(x)$ than linear functions, it is natural to optimize (3) over a bigger
functional space consisting of nonlinear functions. Note that the objective function in (3) involves a
sign operator, which makes it discontinuous in $g$ and thus difficult to optimize in general [17].
Alternatively, note that (3) can be simplified as

$$\min_{g,c}\; \frac{1}{n} \sum_{i=1}^{n} w(y_i)\,(1 - \mathrm{sign}(u_i)),$$

where $u_i = y_i(g(x_i) - c)$. As proposed in Xu et al. [?], a surrogate $\psi_\delta$-loss, defined as

$$L_\delta(u) = \min\Bigl\{\frac{1}{\delta}(\delta - u)_+,\; 1\Bigr\},$$

can be employed to replace the 0-1 loss $L_{01}(u) = (1 - \mathrm{sign}(u))/2$ in the objective function; this loss
equals 1 if $u \le 0$ and 0 otherwise, and differs from the summand above only by a constant factor. The $\psi_\delta$-loss
extends the $\psi$-loss [17, 19] by introducing a parameter $\delta$ that controls the difference between
the surrogate loss and the 0-1 loss. Figure 1 displays the 0-1 loss, the $\psi$-loss and the $\psi_\delta$-loss as
functions of $u$.
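
The relation between the surrogate and the 0-1 loss can be seen in a few lines of code. This is a minimal sketch that assumes the 0-1 loss is scaled to take the values 0 and 1 and that the surrogate has the form min{(δ - u)_+ / δ, 1}; the function names are hypothetical.

```python
def loss_01(u):
    """0-1 loss on the margin u: 1 if u <= 0, else 0 (assumed scaling)."""
    return 1.0 if u <= 0 else 0.0


def loss_psi_delta(u, delta):
    """Surrogate psi_delta-loss: min{(delta - u)_+ / delta, 1}."""
    return min(max(delta - u, 0.0) / delta, 1.0)


def max_gap(delta, grid):
    """Largest absolute difference between the two losses over a grid of margins."""
    return max(abs(loss_psi_delta(u, delta) - loss_01(u)) for u in grid)


margins = [k / 100 for k in range(-100, 101)]
gap_small_delta = max_gap(0.1, margins)
```

The two losses coincide outside the interval (0, δ), so shrinking δ shrinks the region on which the surrogate deviates from the 0-1 loss.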


Figure 1 about here.


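Furthermore, denote $\mathcal{D}_{g,c,\epsilon} = \{x : g(x) - c > 0 \text{ and } |p(x) - \pi| > \epsilon\}$. Proposition 1 shows
that for any $\epsilon > 0$, the $\psi_\delta$-loss is asymptotically Fisher consistent in estimating $\mathcal{D}_{g^*,c^*,\epsilon}$ when $\delta$
approaches 0.

Proposition 1. Given any $\epsilon > 0$, let $(g^*_\delta, c^*_\delta) = \arg\min_{g,c} E(w(Y) L_\delta(Y(g(X) - c)))$, then as
$\delta \to 0$,

$$\Pr(\mathcal{D}_{g^*_\delta, c^*_\delta, \epsilon} \,\Delta\, \mathcal{D}_{g^*, c^*, \epsilon}) \to 0,$$

where $\Delta$ denotes the symmetric difference of two sets.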

With the $\psi_\delta$-loss, the proposed model-free estimation framework for $(g(x), c)$ is formulated as

$$\min_{g \in \mathcal{H}_K,\, c}\; \frac{1}{n} \sum_{i=1}^{n} w(y_i) L_\delta(y_i(g(x_i) - c)) + \lambda J(g), \qquad (4)$$

where $\lambda$ is a tuning parameter, $\mathcal{H}_K$ is set as a RKHS associated with a pre-specified kernel
function $K(\cdot, \cdot)$, and $J(g) = \|g\|^2_{\mathcal{H}_K}$ is the squared RKHS norm penalizing the complexity of $g(x)$.
Popular kernel functions include the linear kernel $K(u, v) = u^T v$, the $m$-th order polynomial
kernel $K(u, v) = (1 + u^T v)^m$, and the Gaussian kernel $K(u, v) = \exp\{-\|u - v\|^2 / 2\tau^2\}$ with a
scale parameter $\tau$. When the linear kernel is used, the resultant $\mathcal{H}_K$ contains all linear functions;
when the Gaussian kernel is used, $\mathcal{H}_K$ becomes much richer and admits more flexible nonlinear
functions.
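
For concreteness, the three kernels can be written down directly. The sketch below is a plain-Python illustration with hypothetical function names; the scale of the Gaussian kernel is passed as `tau`, matching the form exp{-||u - v||^2 / (2 tau^2)}.

```python
import math


def linear_kernel(u, v):
    """K(u, v) = u^T v."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, order):
    """K(u, v) = (1 + u^T v)^order."""
    return (1.0 + linear_kernel(u, v)) ** order


def gaussian_kernel(u, v, tau):
    """K(u, v) = exp(-||u - v||^2 / (2 * tau^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * tau ** 2))


def gram_matrix(xs, kernel):
    """n x n matrix with entries kernel(x_i, x_j)."""
    return [[kernel(xi, xj) for xj in xs] for xi in xs]


# Hypothetical biomarker vectors for three subjects.
demo_x = [(0.0, 0.0), (3.0, 4.0), (1.0, 2.0)]
demo_gram = gram_matrix(demo_x, lambda u, v: gaussian_kernel(u, v, tau=5.0))
```

The Gram matrix of a valid kernel is symmetric with, for the Gaussian kernel, ones on the diagonal.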

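More interestingly, the representer theorem [21] implies that the solution to (4) must be of
the form $g(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x)$, and thus $\|g\|^2_{\mathcal{H}_K} = \alpha^T \mathbf{K} \alpha$ with $\alpha = (\alpha_1, \ldots, \alpha_n)^T$ and $\mathbf{K} =
(K(x_i, x_j))_{i,j=1}^{n}$. The representer theorem greatly simplifies the optimization task by turning the
minimization over a functional space into the minimization over a finite-dimensional vector space.
Specifically, the minimization task in (4) becomes

$$\min_{\alpha \in \mathbb{R}^n,\, c \in \mathbb{R}}\; s(\tilde\alpha) = \frac{1}{n} \sum_{i=1}^{n} w(y_i) L_\delta\Bigl(y_i\Bigl(\sum_{j=1}^{n} \alpha_j K(x_i, x_j) - c\Bigr)\Bigr) + \lambda\, \alpha^T \mathbf{K} \alpha, \qquad (5)$$

where $\tilde\alpha = (\alpha^T, c)^T$ is an $(n + 1)$-dim vector.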
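
The finite-dimensional objective can be evaluated as follows. This is only a sketch under stated assumptions: the penalty is taken to be λ α^T K α, the weights are n/|S_1| and n/|S_{-1}|, and the surrogate loss is min{(δ - u)_+ / δ, 1}. The function names and the toy inputs are hypothetical.

```python
def psi_delta(u, delta):
    """Surrogate loss min{(delta - u)_+ / delta, 1}."""
    return min(max(delta - u, 0.0) / delta, 1.0)


def objective(alpha, c, gram, y, delta, lam):
    """Weighted surrogate risk plus lam * alpha^T K alpha for g(x_i) = sum_j K_ij alpha_j."""
    n = len(y)
    n_pos = sum(1 for yi in y if yi == 1)
    w = {1: n / n_pos, -1: n / (n - n_pos)}
    g = [sum(gram[i][j] * alpha[j] for j in range(n)) for i in range(n)]
    fit = sum(w[y[i]] * psi_delta(y[i] * (g[i] - c), delta) for i in range(n)) / n
    penalty = sum(alpha[i] * gram[i][j] * alpha[j] for i in range(n) for j in range(n))
    return fit + lam * penalty


# Hypothetical inputs: identity Gram matrix and balanced labels.
toy_gram = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
toy_y = [1, -1, 1, -1]
value_at_zero = objective([0.0] * 4, 0.0, toy_gram, toy_y, delta=0.1, lam=0.25)
value_at_fit = objective([1.0, -1.0, 1.0, -1.0], 0.0, toy_gram, toy_y, delta=0.1, lam=0.25)
```

With all coefficients at zero every margin is zero and only the loss term contributes; with coefficients equal to the labels the loss vanishes and only the penalty remains, which shows the trade-off that λ controls.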

The minimization task in (5) involves a non-convex function $L_\delta(\cdot)$, and thus we employ the
difference convex algorithm (DCA; [24]) to tackle the non-convex optimization task. The DCA
decomposes the non-convex objective function into the difference of two convex functions, and
iteratively approximates it through a refined convex objective function. It has been widely used for
non-convex optimization and delivers superior numerical performance [?, 15, 20]. The detail of
solving (5) is similar to that in [15] and thus omitted here.
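
Since the algorithmic details are omitted, the following is only an assumption-labelled sketch of the idea, not the authors' implementation. One standard way to write the surrogate loss as a difference of two convex functions is L_δ(u) = (δ - u)_+ / δ - (-u)_+ / δ. A DCA-type step would then replace the second convex part by its linearization at the current iterate, which yields a convex upper bound. All function names are hypothetical.

```python
def hinge_pos(t):
    """Positive part (t)_+."""
    return max(t, 0.0)


def psi_delta(u, delta):
    """Surrogate loss min{(delta - u)_+ / delta, 1}."""
    return min(hinge_pos(delta - u) / delta, 1.0)


def convex_part_1(u, delta):
    return hinge_pos(delta - u) / delta


def convex_part_2(u, delta):
    return hinge_pos(-u) / delta


def dca_surrogate(u, u_old, delta):
    """Convex upper bound of psi_delta at u, linearizing convex_part_2 at u_old."""
    slope = -1.0 / delta if u_old < 0 else 0.0  # a subgradient of convex_part_2 at u_old
    linearized = convex_part_2(u_old, delta) + slope * (u - u_old)
    return convex_part_1(u, delta) - linearized


check_grid = [k / 50 for k in range(-50, 51)]
```

The bound touches the loss at the current iterate and lies above it elsewhere, which is the property a majorize-then-minimize iteration relies on.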

4 Simulation examples 

This section examines the proposed estimation method for combining biomarkers in a number
of simulated examples. The numerical performance of the proposed kernel machine estimation
(KME) method is compared against some existing popular alternatives, including the min-max
method (MMM) [12], the parametric method under the multivariate normality assumption (MVN)
[26], the non-parametric kernel smoothing method (KSM) with Gaussian kernel [14], the stepwise
method (SWM) [13], and the other two classification methods in [15], the logistic regression (LR)
and the classification tree (TREE).

For illustration, the kernel function used in all methods is set as the linear kernel $K(z_1, z_2) =
z_1^T z_2$ and the Gaussian kernel $K(z_1, z_2) = \exp\{-\|z_1 - z_2\|^2 / 2\tau^2\}$, where the scale parameter $\tau$ is set as
the median of pairwise Euclidean distances between the positive and negative instances within the
training set. The tuning parameter $\lambda$ for our proposed method is selected by 5-fold cross validation
that maximizes the empirical Youden index

$$\hat J = \max_{c} \Biggl\{ \frac{\sum_{i \in V_k} I(y_i = -1,\, \hat g(x_i) \le c)}{\sum_{i \in V_k} I(y_i = -1)} - \frac{\sum_{i \in V_k} I(y_i = 1,\, \hat g(x_i) \le c)}{\sum_{i \in V_k} I(y_i = 1)} \Biggr\}, \qquad (6)$$

where $I(\cdot)$ is an indicator function and $V_k$ is the validation set of the $k$-th fold. The maximization is
conducted via a grid search, where the grid for selecting $\lambda$ consists of 81 values indexed by $s = 1, \ldots, 81$.
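
The rule for the kernel scale is easy to state in code. The sketch below computes the median of all Euclidean distances between a positive and a negative training instance; the function name `median_heuristic` and the toy points are hypothetical.

```python
import math
from statistics import median


def median_heuristic(xs, ys):
    """Median Euclidean distance over all (positive, negative) pairs of instances."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    return median(math.dist(p, q) for p in pos for q in neg)


# Hypothetical training points: one positive and three negatives.
train_x = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0), (6.0, 8.0)]
train_y = [1, -1, -1, -1]
tau_hat = median_heuristic(train_x, train_y)
```

Tying the scale to between-group distances keeps the Gaussian kernel from being either nearly constant or nearly diagonal on the training set.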

The optimal solutions of MVN and KSM are searched by the routine optim() in R as suggested in Yin
and Tian [14]. SWM and MMM are based on the grid search with the same grid. TREE is tuned by
default in R. Furthermore, for the proposed KME method, $\delta$ is set as 0.1 for all simulated examples
as suggested in Hedayat et al. [17].

Four simulated examples are examined. Example 1 is similar to Example 5.1.1 in [14]. Example
2 modifies Example 1 by using the multivariate Gamma distribution, which appears to be a popular
model assumption in the literature [12]. Examples 3 and 4 are similar to Setting 2 in [20] and Example
11(b) in [30], which simulate data from logistic models with nonlinear effect terms.

Example 1. A random sample $\{(X_i, Y_i); i = 1, \ldots, n\}$ is generated as follows. First, $Y_i$ is
generated from Bernoulli(0.5). Second, if $Y_i = 1$, then $X_i$ is generated from $\mathrm{MVN}(\mu_1, \Sigma_1)$,
where $\mu_1 = (0.4, 1.0, 1.5, 1.2)^T$ and $\Sigma_1 = 0.3 I_4 + 0.7 J_4$ with $I_4$ a 4-dimensional identity matrix
and $J_4$ a $4 \times 4$ matrix of all 1's; if $Y_i = -1$, then $X_i$ is generated from $\mathrm{MVN}(\mu_2, \Sigma_1)$ with
$\mu_2 = (0, 0, 0, 0)^T$.
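
A data generator for this setting can be sketched with the standard library alone. The covariance 0.3 I_4 + 0.7 J_4 is obtained here by adding one shared standard normal factor (scaled by sqrt(0.7)) to independent noise (scaled by sqrt(0.3)); this construction, the function name `draw_example1`, and the seed handling are assumptions made for illustration and are not taken from the original code.

```python
import math
import random

MU_POS = (0.4, 1.0, 1.5, 1.2)
MU_NEG = (0.0, 0.0, 0.0, 0.0)


def draw_example1(n, seed=0):
    """Draw n pairs (x_i, y_i) with y_i in {1, -1} and x_i | y_i multivariate normal."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.5 else -1
        mu = MU_POS if y == 1 else MU_NEG
        shared = rng.gauss(0.0, 1.0)  # common factor: covariance 0.7 between coordinates
        x = tuple(
            m + math.sqrt(0.7) * shared + math.sqrt(0.3) * rng.gauss(0.0, 1.0)
            for m in mu
        )
        xs.append(x)
        ys.append(y)
    return xs, ys


sample_x, sample_y = draw_example1(4000, seed=1)
```

Each coordinate then has variance 0.7 + 0.3 = 1 and every pair of coordinates has covariance 0.7, as required by the stated covariance matrix.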


Example 2. A random sample $\{(X_i, Y_i); i = 1, \ldots, n\}$ is generated as follows. First, $Y_i$
is generated from Bernoulli(0.5). Second, if $Y_i = 1$, then $X_i$ is generated from a multivariate
gamma distribution with mean $\mu_1 = (0.55, 0.7, 0.85, 1)^T$ and covariance matrix $\Sigma_1 = 0.25 J_4 +
\mathrm{diag}(0.025, 0.1, 0.175, 0.25)$; if $Y_i = -1$, then $X_i$ is generated from a multivariate gamma
distribution with mean $\mu_2 = (0.55, 0.55, 0.55, 0.55)^T$ and covariance matrix $\Sigma_2 = 0.025 I_4 + 0.25 J_4$. The
multivariate gamma distributed samples are generated with a normal copula.
Example 3. A random sample $\{(X_i, Y_i); i = 1, \ldots, n\}$ is generated as follows. First, $X_i$
is generated from $\mathrm{MVN}(\mu, \Sigma)$, where $\mu = (0, 0, 0, 0)^T$ and $\Sigma = 0.3 I_4 + 0.7 J_4$. Second, $Y_i$ is
generated from a logistic model with $\mathrm{logit}(p(x)) = x_{(1)} + x_{(2)}^2 + [\ldots] + x_{(4)} - [\ldots]$

Example 4. A random sample $\{(X_i, Y_i); i = 1, \ldots, n\}$ is generated as follows. First, $X_i$ is
generated from $\mathrm{MVN}(\mu, \Sigma)$, where $\mu = (0, 0, 0, 0)^T$ and $\Sigma = I_4$. Second, $Y_i$ is generated from a
logistic model with $\mathrm{logit}(p(x)) = 8(\sin(0.5\pi x_{(1)}) + \cos(\pi x_{(1)} x_{(2)}) + [\ldots] + 3 x_{(3)} x_{(4)} + [\ldots]$

In all examples, the sample sizes for training $n_{tr}$ and testing $n_{te}$ are set as $n_{tr} = 100, 250, 500$
and $n_{te} = 2000$, respectively. Each scenario is replicated 100 times. The averaged empirical
Youden index $\hat J$, as well as the corresponding standard deviations, are summarized in Table 1.


Table 1 about here.


It is evident that our proposed methods, the linear kernel machine estimation method (LKME)
and the Gaussian kernel machine estimation method (GKME), yield competitive performance in all
examples. The performance of MVN, SWM, and LR is competitive in Example 1 as the data
within each class indeed follow a Gaussian distribution sharing a common covariance structure,
and thus the linear combination is optimal. Their performance becomes less competitive in other
examples when the linear combination is no longer optimal. In Examples 3 and 4, where
nonlinear patterns are specified, GKME outperforms all other methods. In Example 4 especially,
the performance of GKME is outstanding due to the strong nonlinear pattern specified. In general,
the performance of KSM is less competitive, which could be due to over-fitting when applying
the Gaussian kernel to estimate sensitivity and specificity. With a similar exhaustive grid search, the
performance of SWM is better than MMM in Examples 1 and 4 but worse in Examples 2 and 3.
As for the two classification methods, LR yields competitive performance in Examples 1 and 2 and
becomes less competitive when logistic models with nonlinear patterns are applied in Examples 3
and 4. The performance of TREE is modest considering the nature of recursive partitioning.
5 Real application


In this section, our proposed method is applied to a study of liver disorder. The dataset consists
of 345 male subjects with 200 subjects in the control group and 145 subjects in the case
group. For each subject, there are five blood tests (mean corpuscular volume, alkaline phosphatase,
alanine aminotransferase, aspartate aminotransferase, and gamma-glutamyl transpeptidase)
which are thought to be sensitive to liver disorders that may be related to excessive alcohol
consumption, and another covariate recording the average daily consumption of alcoholic beverages.
The corresponding empirical estimates of the Youden index of all six markers are 0.141,
0.178, 0.174, 0.144, 0.240, and 0.121, respectively. The dataset was created by BUPA Medical
Research Ltd., and is publicly available at the University of California at Irvine Machine Learning
Repository (https://archive.ics.uci.edu/ml/datasets/Liver+Disorders).

The total 345 samples are randomly split into a training set of 200 samples and a testing set
of 145 samples. We also set $\delta = 0.1$ and select the tuning parameter $\lambda$ by 5-fold cross validation
targeting on maximizing (6). The experiment is replicated 100 times, and Figure 2 summarizes
the averaged performance measures of our proposed method, MMM, MVN, KSM, SWM, LR, and
TREE.


Figure 2 about here.
It is evident that our proposed method delivers competitive performance in comparison with
other methods. It is also interesting to notice the significant improvement in diagnostic accuracy
from combining biomarkers nonlinearly. It is encouraging to note that our proposed method with
the Gaussian kernel outperforms all other methods.
6 Closing remarks

This paper proposes a flexible model-free framework for combining multiple biomarkers. As
opposed to most existing methods focusing on the optimal linear combinations, the framework
admits both linear and nonlinear combinations. The superior numerical performance of the proposed
approach is demonstrated in a number of simulated examples and a real application to the liver
disorder study, especially when the sample size is relatively large. Furthermore, the proposed method
is especially efficient with a relatively large number of covariates present, where most existing
methods relying on grid search are often inefficient. Further development could be on estimating
confidence intervals using a perturbation resampling procedure [29] and combining biomarkers under
the covariate-adjusted Youden index setup [15].
Appendix

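Proof of Proposition 1. Since $L_\delta(u) = L_{01}(u) + \frac{\delta - u}{\delta} I(0 < u < \delta)$, we have

$$E(w(Y) L_\delta(Y(g(X) - c))) = E(w(Y) L_{01}(Y(g(X) - c))) + E\Bigl(w(Y) \frac{\delta - Y(g(X) - c)}{\delta} I(0 < Y(g(X) - c) < \delta)\Bigr). \qquad (7)$$

Note that $E\bigl(w(Y) \frac{\delta - Y(g(X) - c)}{\delta} I(0 < Y(g(X) - c) < \delta)\bigr)$ decreases as $\delta$ decreases, and approaches 0 when
$\delta \to 0$. Furthermore, for any given $\epsilon > 0$,

$$
\begin{aligned}
& E(w(Y) L_{01}(Y(g(X) - c))) - E(w(Y) L_{01}(Y(g^*(X) - c^*))) \\
&\quad = \int_{\mathcal{D}_{g,c,0} \cap \mathcal{D}^c_{g^*,c^*,0}} \frac{\pi - p(x)}{\pi(1 - \pi)} f(x)\,dx + \int_{\mathcal{D}^c_{g,c,0} \cap \mathcal{D}_{g^*,c^*,0}} \frac{p(x) - \pi}{\pi(1 - \pi)} f(x)\,dx \\
&\quad \ge \int_{\mathcal{D}_{g,c,\epsilon} \cap \mathcal{D}^c_{g^*,c^*,\epsilon}} \frac{\pi - p(x)}{\pi(1 - \pi)} f(x)\,dx + \int_{\mathcal{D}^c_{g,c,\epsilon} \cap \mathcal{D}_{g^*,c^*,\epsilon}} \frac{p(x) - \pi}{\pi(1 - \pi)} f(x)\,dx. \qquad (8)
\end{aligned}
$$

By (2), we have

$$
\begin{cases}
\pi - p(x) > \epsilon, & \text{if } x \in \mathcal{D}_{g,c,\epsilon} \cap \mathcal{D}^c_{g^*,c^*,\epsilon}, \\
p(x) - \pi > \epsilon, & \text{if } x \in \mathcal{D}^c_{g,c,\epsilon} \cap \mathcal{D}_{g^*,c^*,\epsilon}.
\end{cases}
$$

Therefore,

$$E(w(Y) L_{01}(Y(g(X) - c))) - E(w(Y) L_{01}(Y(g^*(X) - c^*))) \ge \epsilon \Pr(\mathcal{D}_{g,c,\epsilon} \,\Delta\, \mathcal{D}_{g^*,c^*,\epsilon}).$$

By the fact that $E(w(Y) L_\delta(Y(g^*_\delta(X) - c^*_\delta))) \le E(w(Y) L_\delta(Y(g^*(X) - c^*)))$, we have

$$
\begin{aligned}
& E(w(Y) L_{01}(Y(g^*_\delta(X) - c^*_\delta))) - E(w(Y) L_{01}(Y(g^*(X) - c^*))) \\
&\quad \le E\Bigl(w(Y) \frac{\delta - Y(g^*(X) - c^*)}{\delta} I(0 < Y(g^*(X) - c^*) < \delta)\Bigr).
\end{aligned}
$$

It follows immediately that $\Pr(\mathcal{D}_{g^*_\delta, c^*_\delta, \epsilon} \,\Delta\, \mathcal{D}_{g^*, c^*, \epsilon}) \to 0$ as $\delta \to 0$. □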
Table 1: Simulation examples: estimated means and standard deviations (in parentheses) of the
empirical Youden index J over 100 replications.

            n = 100           n = 250           n = 500

Example 1
  LKME      0.604 (0.0042)    0.628 (0.0019)    0.641 (0.0018)
  GKME      0.572 (0.0063)    0.604 (0.0029)    0.623 (0.0023)
  MMM       0.455 (0.0032)    0.470 (0.0021)    0.483 (0.0020)
  MVN       0.633 (0.0018)    0.638 (0.0014)    0.647 (0.0012)
  KSM       0.388 (0.0180)    0.458 (0.0104)    0.490 (0.0106)
  SWM       0.555 (0.0065)    0.594 (0.0044)    0.611 (0.0035)
  LR        0.628 (0.0022)    0.639 (0.0017)    0.646 (0.0017)
  TREE      0.490 (0.0068)    0.525 (0.0047)    0.559 (0.0029)

Example 2
  LKME      0.636 (0.0075)    0.690 (0.0025)    0.710 (0.0015)
  GKME      0.612 (0.0054)    0.654 (0.0045)    0.696 (0.0016)
  MMM       0.609 (0.0033)    0.622 (0.0025)    0.622 (0.0022)
  MVN       0.573 (0.0065)    0.571 (0.0047)    0.563 (0.0040)
  KSM       0.214 (0.0281)    0.046 (0.0164)    0.047 (0.0171)
  SWM       0.447 (0.0094)    0.426 (0.0078)    0.429 (0.0065)
  LR        0.648 (0.0054)    0.675 (0.0028)    0.678 (0.0025)
  TREE      0.433 (0.0052)    0.512 (0.0039)    0.555 (0.0036)

Example 3
  LKME      0.296 (0.0091)    0.367 (0.0053)    0.389 (0.0049)
  GKME      0.511 (0.0052)    0.568 (0.0028)    0.592 (0.0022)
  MMM       0.423 (0.0035)    0.434 (0.0021)    0.443 (0.0018)
  MVN       0.344 (0.0050)    0.371 (0.0045)    0.377 (0.0041)
  KSM       0.192 (0.0085)    0.193 (0.0084)    0.202 (0.0086)
  SWM       0.370 (0.0057)    0.406 (0.0028)    0.417 (0.0025)
  LR        0.307 (0.0043)    0.316 (0.0030)    0.320 (0.0026)
  TREE      0.424 (0.0059)    0.477 (0.0042)    0.528 (0.0031)

Example 4
  LKME      0.103 (0.0102)    0.150 (0.0098)    0.209 (0.0089)
  GKME      0.529 (0.0078)    0.626 (0.0050)    0.682 (0.0028)
  MMM       0.184 (0.0084)    0.227 (0.0034)    0.236 (0.0026)
  MVN       0.109 (0.0071)    0.152 (0.0056)    0.189 (0.0054)
  KSM       0.188 (0.0050)    0.213 (0.0035)    0.220 (0.0028)
  SWM       0.255 (0.0078)    0.293 (0.0050)    0.307 (0.0039)
  LR        0.002 (0.0023)    0.004 (0.0008)    0.011 (0.0007)
  TREE      0.257 (0.0143)    0.364 (0.0111)    0.368 (0.0101)
Figure 1: The 0-1 loss function, the $\psi$-loss and the $\psi_{0.5}$-loss.









